Semi-continuous segmental probability modeling for continuous speech recognition

Authors

  • Jiyong Zhang
  • Fang Zheng
  • Mingxing Xu
  • Ditang Fang
Abstract

This paper presents the design of semi-continuous segmental probability models (SCSPMs) for large-vocabulary continuous speech recognition. The tied Gaussian densities are trained using data from all states of all utterances, while the mixture weights of each state are estimated using only the data from that state. The SCSPMs tie all the densities of all states from all Speech Recognition Units (SRUs) to form a shared pdf codebook, which greatly reduces the number of Gaussian densities. Several pruning methods are reviewed, and a new pruning criterion is then proposed to reduce the number of tied mixture Gaussian densities, given that only a small subset of the mixture Gaussian densities have large tying weights. Our preliminary experiments show that the SCSPM combined with the pruning techniques can reduce model storage and speed up the system with little degradation in accuracy compared to the prior continuous model.
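The tying scheme in the abstract can be illustrated with a minimal Python sketch. It assumes diagonal-covariance Gaussians and a simple "keep the K largest weights" pruning rule; that rule is only a stand-in for illustration and is not the new criterion the paper proposes, which the abstract does not specify. All names (`codebook`, `prune_weights`, `state_likelihood`) and all numbers are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# One codebook entry: (mean vector, diagonal variance vector).
Gaussian = Tuple[Sequence[float], Sequence[float]]


def log_gaussian_diag(x: Sequence[float], mean: Sequence[float],
                      var: Sequence[float]) -> float:
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def prune_weights(weights: Sequence[float], keep: int) -> List[Tuple[int, float]]:
    """Keep the `keep` largest weights and renormalize them to sum to 1.

    Illustrative top-K rule (an assumption), not the paper's criterion.
    """
    top = sorted(enumerate(weights), key=lambda iw: iw[1], reverse=True)[:keep]
    total = sum(w for _, w in top)
    return [(i, w / total) for i, w in top]


def state_likelihood(x: Sequence[float], codebook: Sequence[Gaussian],
                     sparse_weights: Sequence[Tuple[int, float]]) -> float:
    """Sum of weight * N(x; mean_k, var_k) over the retained codebook entries."""
    return sum(
        w * math.exp(log_gaussian_diag(x, *codebook[k]))
        for k, w in sparse_weights
    )


# Shared codebook: every state draws on the same Gaussians.
codebook: List[Gaussian] = [
    ([0.0], [1.0]), ([3.0], [1.0]), ([-3.0], [1.0]), ([6.0], [1.0]),
]

# Per-state mixture weights; only a few entries carry most of the mass.
weights_a = [0.70, 0.25, 0.04, 0.01]

full_a = list(enumerate(weights_a))
pruned_a = prune_weights(weights_a, keep=2)

x = [0.5]
full_p = state_likelihood(x, codebook, full_a)
pruned_p = state_likelihood(x, codebook, pruned_a)
print(f"full: {full_p:.6f}  pruned: {pruned_p:.6f}")
```

Because the Gaussians are shared, each state stores only a short list of (index, weight) pairs after pruning, which is where the storage and speed savings described in the abstract would come from under these assumptions.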

Similar resources

Semi-continuous segmental probability model for speech signals

A semi-continuous segmental probability model, which can be considered as a special form of continuous mixture segmental probability model with continuous output probability density functions sharing in a mixture Gaussian density codebook, is proposed in this paper. The amount of training data required, as well as the computational complexity of the semi-continuous segmental probability model(S...

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

A segmental approach to text-independent speaker verification

Current text-independent speaker verification systems are usually based on modeling globally the probability density function (PDF) of the speaker feature vectors. In this paper, segmental approaches to text-independent speaker verification are discussed. Unlike the schemes based on Large Vocabulary Continuous Speech Recognition (LVCSR) with previously trained phone models, our systems are based ...

Multiple codebook semi-continuous hidden Markov models for speaker-independent continuous speech recognition

A semi-continuous hidden Markov model based on the multiple vector quantization codebooks is used here for large-vocabulary speaker-independent continuous speech recognition. In the techniques employed here, the semi-continuous output probability density function for each codebook is represented by a combination of the corresponding discrete output probabilities of the hidden Markov model and t...

Segment-Based Acoustic Models for Continuous Speech Recognition

Tied-mixture (or semi-continuous) distributions are an important tool for acoustic modeling, used in many high-performance speech recognition systems today. This paper provides a...

...ity or acoustic observations conditioned on the state in hidden-Markov models (HMM), or for the case of the SSM, conditioned on a region of the model. Some of the options that have been investigated include discrete dis...

Publication date: 2000